Higher order numerical differentiation 
on the Infinity Computer 

"(<q". Yaroslav D. Sergeyev* 

o 

(N 

C3 \ Abstract 

There exist many applications where it is necessary to approximate nu- 
merically derivatives of a function which is given by a computer procedure. 
In particular, all the fields of optimization have a special interest in such a 
kind of information. In this paper, a new way to do this is presented for a 
new kind of a computer - the Infinity Computer - able to work numerically 
with finite, infinite, and infinitesimal numbers. It is proved that the Infinity 
Computer is able to calculate values of derivatives of a higher order for a 
<— j ' wide class of functions represented by computer procedures . It is shown that 
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the ability to compute derivatives of arbitrary order automatically and accu- 
rate to working precision is an intrinsic property of the Infinity Computer 
related to its way of functioning. Numerical examples illustrating the new 
concepts and numerical tools are given. 
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^ ■ 1 Introduction 

O 

CN ' In many practical applications related to the scientific computing (e.g., in global 

and local optimization, numerical simulation, approximation, etc.) it is necessary 
to calculate derivatives of a function g(x) which is given by a computer procedure 
calculating its approximation f(x). Very often a user working with the computing 
code f(x) is not the person who has written this code. As a result, for the user the 
program calculating y = f(x) is just ablack box, i.e., if it has as the input a value x 
then the program returns the corresponding value y and the user does not know the 
internal structure of the program. As a result, when for solving an applied problem 
the usage of derivatives is required and a procedure for evaluating the exact value 
of f'(x) is not available, we face the necessity to approximate f'(x) in a way. 
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In particular, this situation happens very often in the black box global and local 
optimization (see (5J[TTl[T9j|20l) and related application areas. Let us give a simple 
but important example (see, e.g., HI 9112011231 ) related to the problem of finding the 
minimal root of an equation f(x) = where x 6 [a, b] and f(x) is multiextremal 
(as a result, there can be several different roots over [a, b]), given by a computer 
program and such that /(a) > 0. This problem arises in many applications, such as 
time domain analysis (see Q), filter theory (see |7)), and wavelet theory (see ETlO 
and can be interpreted, for instance, as follows. 

It is necessary to know the behavior of a device over a time interval [a, b]. The 
device starts to work at the time x = a and it functions correctly while for x > a 
the computer procedure calculating f{x) returns values f(x) > 0. Of course, at 
the initial moment, x = a, the device works correctly and f(a) > 0. It is necessary 
either to find an interval [a, x*) such that 

f(x*) = 0, f(x)>0, xe[a,x*), x*E(o,6], (1) 

or to prove that x* satisfying CQ) does not exist in [a, b]. Efficient methods proposed 
recently for solving this problem (see HUH [19] ) strongly use ideas developed in 
the field of global optimization. They require calculating the first derivative f'(x) 
of f(x) and since a program calculating f'(x) is usually not available, the problem 
of finding an approximation of f'(x) arises. 

There exist several approaches to tackle this problem. First, numerical ap- 
proximations are used for this purpose (see e.g., ||9] for a detailed discussion). In 
applications, the following three simple formulae (more complex and numerically 
more expensive approximations can be found in [9j) are often used 

/(I)K /(*+")-/(*) , ,(,), /(*>-/(.-*) . <2) 

f{x) k f ix + H)-f lx -H) (3) 

by practitioners. However, these procedures are fraught with danger (see [9]) since 
eventually round-off errors will dominate calculation. As h tends to zero, both 
f(x + h) and f(x — h) tend to f(x), so that their difference tends to the difference 
of two almost equal quantities and thus contains fewer and fewer significant digits. 
Thus, it is meaningless to carry out these computations beyond a certain threshold 
value of h. Calculations of higher derivatives suffer from the same problems. 

The complex step method (see (HI) allows one to improve approximations of 
f'(x) avoiding subtractive cancellation errors present in (|2}, © by using the fol- 
lowing formula to approximate f'(x) 

f(x) « MA|±M , (4 ) 

where Im(u) is the imaginary part of u. Though this estimate does not involve 
the dangerous difference operation, it is still an approximation of f'(x) because it 
depends on the choice of the step h. 



Another approach consists of the usage of symbolic (algebraic) computations 
(see, e.g., (H) where f(x) is differentiated as an expression in symbolic form in 
contrast to manipulating of numerical quantities represented by the symbols used 
to express f(x). Unfortunately this approach can be too slow when it is applied to 
long codes coming from real world applications. 

There exist an extensive literature (see, e.g., lQ]|2]|5] and references given 
therein) dedicated to automatic (algorithmic) differentiation (AD) that is a set of 
techniques based on the mechanical application of the chain rule to obtain deriva- 
tives of a function given as a computer program. By applying the chain rule of 
derivation to elementary operations this approach allows one to compute deriva- 
tives of arbitrary order automatically with the precision of the code representing 

fix). 

Implementations of AD can be broadly classified into two categories that have 
their advantages and disadvantages (see EJ|5] for a detailed discussion): (i) AD 
tools based on source-to-source transformation changing the semantics by explic- 
itly rewriting the code; (ii) AD tools based on operator overloading using the fact 
that modem programming languages offer the possibility to redefine the seman- 
tics of elementary operators. In particular, the dual numbers extending the real 
numbers by adjoining one new element d with the property d 2 = (i.e., d is nilpo- 
tent) can be used for this purpose (see, e.g., [fl]). Every dual number has the form 
v = a + db, where a and b are real numbers and v can be represented as the ordered 
pair (a, b). On the one hand, dual numbers have a clear similarity with complex 
numbers z = a + ib where i 2 = — 1. On the other hand, speaking informally it can 
be said that the imaginary unit d of dual numbers is a close relative to infinitesi- 
mals (we mean here a general non formalized idea about infinitesimals) since the 
square (or any higher power) of d is exactly zero and the square of an infinitesimal 
is 'almost zero'. 

All the methods described above use traditional computers as computational 
devices and propose a number of techniques to calculate derivatives on them. In 
this paper, a new way to calculate derivatives numerically is proposed. It is made 
by using a new kind of a computer - the Infinity Computer - introduced in |[T3l[T5l 
and able to work numerically with finite, infinite, and infinitesimal quantities. This 
computer is based on a new applied point of view on infinite and infinitesimal 
numbers (that is not related to non-standard analysis) introduced in Ifl2l[l4l . The 
new approach does not use Cantor's ideas and works with infinite and infinitesimal 
numbers being in accordance with Aristotle's principle 'The part is less than the 
whole'. 

We conclude this introduction by emphasizing that traditional approaches for 
differentiation considered above have been developed ad hoc for solving this prob- 
lem as additional tools that should be used together with the traditional computers. 
Without these additional tools the traditional computers are not able to calculate 
derivatives of functions defined by computer procedures. In this paper, it is shown 
that the ability to compute derivatives of arbitrary order automatically and accu- 
rate to working precision is an intrinsic property of the Infinity Computer related 



to its way of functioning. This is just one of the particular features offered to the 
user by the Infinity Computer. Naturally, this is a direct consequence of the fact 
that it can execute numerical computations with infinite and infinitesimal quantities 
explicitly. 

2 Representation of numbers at the Infinity Computer 

In B121ll4[[T6l[T7l . a new powerful numeral system has been developed to express 
finite, infinite, and infinitesimal numbers in a unique framework. The main idea 
consists of measuring infinite and infinitesimal quantities by different (infinite, fi- 
nite, and infinitesimal) units of measure. In this section we give just a brief intro- 
duction to the new methodology that can be found in a rather comprehensive form 
in the survey [ 14] or in the monograph [12] written in a popular manner. 

A new infinite unit of measure has been introduced as the number of ele- 
ments of the set N of natural numbers. It is expressed by a new numeral ® called 
grossone. It is necessary to emphasize immediately that the infinite number ® is 
not either Cantor's Ko or uj and the new approach is not related to the non-standard 
analysis. For instance, one of the important differences consists of the fact that 
infinite integer positive numbers that can be viewed by using numerals including 
grossone can be inteipreted in the terms of the number of elements of certain in- 
finite sets. Another difference consists of the fact that ® has both cardinal and 
ordinal properties as usual finite natural numbers. 

Formally, grossone is introduced as a new number by describing its properties 
postulated by the Infinite Unit Axiom (IUA) (see Ifl2l[l4"l0 . This axiom is added 
to axioms for real numbers similarly to addition of the axiom determining zero to 
axioms of natural numbers when integer numbers are introduced. Inasmuch as it 
has been postulated that grossone is a number, all other axioms for numbers hold 
for it, too. Particularly, associative and commutative properties of multiplication 
and addition, distributive property of multiplication over addition, existence of in- 
verse elements with respect to addition and multiplication hold for grossone as for 
finite numbers. This means that the following relations hold for grossone, as for 
any other number 

0-® = ®-0 = 0, ®-® = 0, 4 = 1, ®° = 1, 1® = 1, 0® = 0. (5) 

To express infinite and infinitesimal numbers at the Infinity Computer, records 
similar to traditional positional numeral systems can be used (see lfT2Ul4lO . Num- 
bers expressed in this new positional systems with the radix ® are called hereinafter 
grossnumbers. In order to construct a number C in this system, we subdivide C 
into groups corresponding to powers of grossone: 

C = c Pm ® p ™ + ... + c Pl ® Pl + c Po ® Po + q,.!®"- 1 + • • • + c p _ k ® p -K (6) 

Then, the record 

C = c Pm ® p ™ . . . Cp^cp^c^®?- 1 . . . c p _ k ® p - k (7) 



represents the number C, where finite numbers a 7^ called grossdigits can be 
positive or negative. They show how many corresponding units should be added or 
subtracted in order to form the number C. Grossdigits can be expressed by several 
symbols. 

Numbers pi in (0 called grosspowers can be finite, infinite, and infinitesimal, 
they are sorted in the decreasing order with po = 

Pm. > Pm-l > • • • > Pi > PO > P-l > ■ ■ -P-(fc-l) > P-k- 

In the record ©, we write ® Pi explicitly because in the new numeral positional 
system the number i in general is not equal to the grosspower p t (see lPT4l for a 
detailed discussion). 

Finite numbers in this new numeral system are represented by numerals having 
only one grosspower po = 0. In fact, if we have a number C such that m = k = 
in representation ©, then due to ©, we have C = cq® = cq. Thus, the number 
C in this case does not contain grossone and is equal to the grossdigit cq being a 
conventional finite number expressed in a traditional finite numeral system. 

The simplest infinitesimal numbers are represented by numerals C having only 
finite or infinite negative grosspowers, e.g., 6.73®~ 4 56.7® -150 . The simplest 
infinitesimal number is 4r = CD" 1 being the inverse element with respect to multi- 
plication for ®: 

©- 1 ■ <S> = ® ■ ® -1 = 1. (8) 

Note that all infinitesimals are not equal to zero. Particularly, -k- > because it is 
a result of division of two positive numbers. 

In the context of the numerical differentiation discussed in this paper, it is worth 
mentioning that (without going in a detailed and rather philosophical discussion on 
the topic 'Can or cannot dual numbers be viewed as a kind of infinitesimals?' ) there 
exist two formal differences between infinitesimals C from © and dual numbers 
(see, e.g., Q). First, for any infinitesimal C it follows C 2 > (for instance, 
(®~ x ) 2 > 0) whereas for dual numbers we have d 2 = 0. Second, in the context 
of HI the element d represented as (0, 1) has not its inverse and infinitesimals C 
have their inverse. 

The simplest infinite numbers are expressed by numerals having positive finite 
or infinite grosspowers. They have infinite parts and can also have a finite part and 
infinitesimal ones. For instance, the number 

1.5® 14 - 2 (-10.645)® 5 7.89® 81®~ 4 - 2 72.8®- 60 

has two infinite parts 1.5® 142 and -10.645® 5 one finite part 7.89®° and two 
infinitesimal parts 81® -4 ' 2 and 72. 8® -60 . All of the numbers introduce above can 
be grosspowers, as well, giving so a possibility to have various combinations of 
quantities and to construct terms having a more complex structure. 

A working software simulator of the Infinity Computer has been implemented 
and the first application - the Infinity Calculator - has been realized. We conclude 



this section by emphasizing the following important issue: the Infinity Computer 
works with infinite, finite, and infinitesimal numbers numerically, not symbolically 
(see HSl). 

3 Numerical differentiation 

Let us return to the problem of numerical differentiation of a function g[x). We 
suppose that a set of elementary functions (sin(x), cos(a;), a x etc.) is represented 
at the Infinity Computer by one of the usual ways used in traditional computers 
(see, e.g. iflOl ) involving the argument x, finite constants, and four arithmetical 
operations. A programmer writes a program P that should calculate g(x) using the 
said implementations of elementary functions, the argument x, and finite constants 
connected by four arithmetical operations. Obviously, P calculates a numerical 
approximation f{x) of the function g{x). As a rule, the programmer does not use 
analytical formulae of f'(x), f"(x), . . . f( k \x) to write the program calculating 
f(x). We suppose that f(x) approximates g(x) sufficiently well with respect to 
some criteria and we shall not discuss the goodness of this approximation in this 
paper. 

Then, as often happens in the scientific computing, a user takes the program P 
calculating f(x) and is interested to calculate f'(x) and higher derivatives numer- 
ically by using this program. Computer programs for calculating f'(x), f"(x), . . . 
f( k \x) and their analytical formulae are unavailable and the internal structure of 
the program calculating f(x) is unknown to the user. 

In this situation, our attention will be attracted to the problem of a numerical 
calculation of the derivatives f'(x), f"(x), . . . f^ k \x) and to the information that 
can be obtained from the computer procedure P calculating f(x) for this purpose 
when it is executed at the Infinity Computer. The following theorem holds. 

Theorem 1 Suppose that: (i) for a function f{x) calculated by a procedure im- 
plemented at the Infinity Computer there exists an unknown Taylor expansion in 
a finite neighborhood 5(y) of a finite point y; (ii) f(x), f'(x), f"(x), . . . f^ k \x) 
assume finite values or are equal to zero for x € 5(y); (Hi) f(x) has been eval- 
uated at a point y + ®~ G &{y)- Then the Infinity Computer returns the result 
of this evaluation in the positional numeral system with the infinite radix ® in the 
following form 

f(y + ®~ x ) = co® ^!®" 1 ^®- 2 . . . c_ {k _ 1} ®-( k -Vc_ k ®- k , (9) 

where 

f(y) = c , f(y) = c_ l5 f"(y) = 2!c_ 2 , . . . f^(y) = k\c_ k . (10) 

Proof. Due to its rules of operation (see ©, ©), the Infinity Computer collects 
different exponents of ® in independent groups c p _ i ® p ~ i with finite grossdigits 



c p _ i when it calculates /(y+® -1 ). Since functions f(x), f'(x), f"(x), . . . f^ k '{x) 
assume finite values or are equal to zero in 5{y) which is also finite, the highest 
grosspower in the number (|9} is necessary less or equal to zero. Thus, the number 
that the Infinity Computer returns can have only a finite and infinitesimal parts. 

Four arithmetical operations (see H14II151 ) executed by the Infinity Computer 
with the operands having finite integer grosspowers in the form (0 produce only 
results with finite integer grosspowers. This fact ensures that the result /(y + ®~ 1 ) 
can have only integer non-positive grosspowers in (0. Due to the rules of the 
positional system (see ©, (O), the number f(y + ® _1 ) from © can be written as 
follows 

f(y + Or 1 ) = co® ^!®" 1 ^®- 2 . . . c_ (fe _ 1) ®-( fc - 1 )c_ fe ®- fe = 

C ®° + ^i®" 1 + C_2®~ 2 + • • • + C.^.!)®-^- 1 ) + C_ fc ®" fc . (11) 

The Infinity Computer while calculates the value f(y + ®~ 1 ) does not use the 
Taylor expansion for f(x), it just executes commands of the program. However, 
this unknown Taylor expansion for f(x) (we emphasize that it is unknown for: 
the Infinity Computer itself, for the programmer, and for the user) exists in the 
neighborhood 5{y) of the point y, for a point x = y + h € 5(y), h > 0. Thus, it 
should be true 

1,2 i,k 

f(y + h) = f(y) + f'(y)h + f"(y) Y + ... + / (fe) (y) ¥ + • • • (12) 

By assuming h = ® _1 in (fT2l ) and by using the fact that ®° = 1 (see (O) we 
obtain 

/(y + ®- 1 ) = /(y)® + /'(y)®- 1 + ^®- 2 + ... + ^M®- fc + ... (13) 

The uniqueness of the Taylor expansion allows us to obtain ® by equating the first 
k+l coefficients of ® in (fT3l) with grossdigits cq, c_i, c_2, • • • c_(j._i), C-k in (fTTT) 
completing so the proof. □ 

Let us comment upon the theorem. It describes a situation where a user needs 
to evaluate f(x) and its derivatives at a point x = y but analytic expressions of 
f(x), f'(x), f"{x), . . . f( k '(x) are unknown and computer procedures for calcu- 
lating f'(x), f"(x), . . . f^ k \x) are unavailable. Moreover, the internal structure of 
the procedure P calculating f(x) can also be unknown to the user. In this situation, 
instead of the usage of, for instance, traditional formulae ©, © for an approxi- 
mation of f'{x), the user evaluates f(x) at the point x = y + ® _1 at the Infinity 
Computer. Note that if P has been written by the programmer for the Infinity Com- 
puter, then the user just runs P without any intervention on the code of P. In the 
case when P has been written for traditional computers, in order to transfer it to 
the Infinity Computer, variables and constants used in P should be just redeclared 
as grossnumbers ©. Traditional arithmetic operations are then overloaded due to 
the rules defined in 111411151 . 



The operation of evaluation of f(x) at the point x = y + 3)" 1 returns a num- 
ber in the form © from where the user can easily obtain values of f(y) and 
f'(y), f"(y), ■ ■ ■ f (y) as shown in (TTOb without any knowledge of the Taylor 
expansion of f(x) and of the analytic formulae and computer procedures for eval- 
uating derivatives. Due to the fact that the Infinity Computer is able to work with 
infinite and infinitesimal numbers numerically, the values f'(y), . . . f^ (y) are cal- 
culated exactly at the point x = y without introduction of dangerous operations ©, 
® (or ©) related to the necessity to use finite values of h when one works with 
traditional computers. We emphasize also that the user obtains the function value 
and the values of the derivatives after calculation of f(x) at a single point. 

It is worthy to notice that numerical operations that the Infinity Computer per- 
forms when it executes the program f(x) can be viewed as an automatic rewriting 
of f(x) from the basis in x into the basis in CD by setting x = y + ® _1 with y being 
a finite number. The numerical finite value of y is then combined with other finite 
numbers present in the program and they all are collected as finite coefficients (i.e., 
grossdigits) of grosspowers of ®. In some sense this is similar to rearrangements 
that often are executed when one works with wavelets (see ETlO or with formal 
power series (see Gil ). 

Let us consider some numerical examples. Their results can be checked by the 
reader directly on systems using symbolic calculations (e.g., MAPLE) by taking 
instead of ®~ x a symbolic parameter, let say, a, thinking about a as an infinites- 
imal number and by calculating then f(y + a) where y is a number. The crucial 
difference of the Infinity Computer with respect to systems executing symbolic 
computations consists of the fact that the Infinity Computer works with infinite, 
finite, and infinitesimal numbers numerically, not symbolically. Naturally, this fea- 
ture of the Infinity Computer becomes veiy advantageous when one should execute 
complex numerical computations. 

Example 1 Suppose that we have a computer procedure implementing the follow- 
ing function g(x) = x 3 as /(x) = x ■ x ■ x and we want to evaluate the values 
/(?/)) f'{y)i f"{y)i an d f (y) at the point y = 5. The Infinity Computer executes 
the following operations 

/(5 + ®~ l ) = 5® !®" 1 • 5® !®" 1 • 5® !®" 1 = 

25®°10®~ 1 1®~ 2 • 5® !®" 1 = 125®°75®~ 1 15®~ 2 1® -3 . (14) 

From A14h by applying < \10i we obtain that 

/(5) = 125, /'(5) = 75, /"(5) = 2! • 15 = 30, /( 3 )(5) = 3! • 1 = 6, 

that are correct values of f(x) and the derivatives at the point y = 5. 

Let us check this numerical result analytically by taking a generic point y. Then 
we obtain 

f(y + ®~ x ) = (y + ®~ l f = (y + ®~ l ) ■ (y + Q' 1 ) ■ (y + ®- x ) = (15) 



y 3 + 2,y 2 ®~ 1 + 3y®~ 2 + ®~ 3 = y 3 ®°3y 2 ® _1 3y®- 2 l®- 3 . (16) 

By applying ftlOi we have the required values 

f(y) = y\ f'(y) = 3y 2 , /" (y) = 2! • 3y = 6y, f®(y) = 3! • 1 = 6. 
That coincide with the respective analytical derivatives calculated at the point x = 

f'(x) = 3x 2 , f"(x) = 6x, / 3 )(x) = 6. D 

Example 2 Suppose that we have the following function g(x) = x + sin(x) and it 
is represented in the Infinity Computer as 

f(x) = x + sin(x), (17) 

where sin(:E) is a computer implementation of sin(:r). If we want to evaluate 
f(x), f'(x), f"(x), and f^'(x) at a point y, by taking k = 3 in ([£]) we obtain 

f(y + ®" x ) = (y + My))®°(i + sln'^))®- 1 ^^®- 2 ^^ 3 , 

where the result depends on the way of implementation of sin(x). For example, 
suppose for the illustrative purpose that in the neighborhood of the point y = the 
Infinity Computer uses the following simple implementation 

, , iA-i iXj tXj 

sm(x) = x 

being the first two items in the corresponding Taylor expansion. Then the computer 
program f(x) becomes 

X ■ X ■ X 



f(x) = X + X 



6 
and the Infinity Computer with y = works as follows 

/(„ + «r') = o + ®- + o + or' - MiMil^l _ 

6 

® -3 

2®" 1 = 2®- 1 (-0.166667)®~ 3 . 

6 

By applying AlOi we have the required values 

/(0) = 0, /'(0) = 2, /"(0) = 2! • = 0, / (3) (0) =3! -(-0.166667) = -1. 

That, obviously, coincide with the respective analytical derivatives (that, we em- 
phasize this fact again, were not used by the Infinity Computer) 

f'(x) = 2-0.5x 2 , f"(x) = -x, /( 3 )(x) = -l 

calculated at the point y = 0. □ 



Example 3 Suppose that we have a computer procedure f{x) = X ' X J~ imple- 
menting the function g{x) = x + and we want to calculate the values f(y), f'(y), 
f"(y), and f( 5 > (y) at a point y = 3. We consider the Infinity Computer that returns 
grossdigits corresponding to the exponents of gross one from to -3. Then we have 

f(3 + &- 1 ) = (3 + ®~ 1 )-(3 + ®~ 1 ) + 1 __ IQ®^©" 1 !©- 2 _ 
[ '~ 3 + ®" 1 ~ 3® !®" 1 

3.333333® 0.888889®- 1 0.037037® -2 - 0.0123457®" 3 . 
By applying f liOD we obtain that 

/(3) = 3.333333, /'(3) = 0.888889, 

/"(3) = 2! • 0.037037 = 0.074074, / (3) (3) = 3! • (-0.0123457) = -0.074074, 
that are values which one obtains by using explicit analytic formulae 

f( x ) = ^±l, f{x) = l-x~\ /"(*) = 2x~ 3 , f^\x) = -6x~ 4 

X 

for f(x) and its derivatives at the point x = 3. □ 
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